Optimal pattern matching algorithms

Author

  • Gilles Didier
Abstract

We study a class of finite state machines, called w-matching machines, which make it possible to simulate the behavior of pattern matching algorithms while searching for a pattern w. They can be used to compute the asymptotic speed, i.e. the limit of the expected ratio of the number of text accesses to the length of the text, of algorithms parsing an iid text to find the pattern w. We define the order of a matching machine or of an algorithm as the maximum difference between the current and accessed positions during a search (standard algorithms are generally of order |w|). We show that, given a pattern w, an order k and an iid model, there exists an optimal w-matching machine, i.e. one with the greatest asymptotic speed under the model among all machines of order k, whose set of states belongs to a finite and enumerable set. It follows that it is possible to determine: 1) the greatest asymptotic speed among a large class of algorithms, with regard to a pattern and an iid model, and 2) a w-matching machine, and thus an algorithm, achieving this speed.
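The ratio of text accesses to text length described in the abstract can also be estimated empirically for a concrete algorithm. The sketch below is not the w-matching machine construction of the paper: it is a Monte Carlo estimate that runs a standard Horspool search on a randomly generated iid text and counts every read of a text symbol. The function names `horspool_accesses` and `estimate_access_ratio`, the parameter `probs` (symbol probabilities of the iid model) and the default text length are all assumptions made for illustration.

```python
import random


def horspool_accesses(pattern, text):
    """Horspool search; returns (match positions, number of text accesses).

    Assumes a non-empty pattern. Only reads of text symbols are counted.
    """
    m, n = len(pattern), len(text)
    # Shift table: distance from the rightmost occurrence of each symbol
    # (last pattern position excluded) to the end of the pattern.
    shift = {c: m - 1 - i for i, c in enumerate(pattern[:-1])}
    matches, accesses, pos = [], 0, 0
    while pos + m <= n:
        j = m - 1
        # Compare the window right to left, counting each text read.
        while j >= 0:
            accesses += 1
            if text[pos + j] != pattern[j]:
                break
            j -= 1
        if j < 0:
            matches.append(pos)
        # The last symbol of the window was already read by the first
        # comparison above, so the shift costs no extra access.
        pos += shift.get(text[pos + m - 1], m)
    return matches, accesses


def estimate_access_ratio(pattern, probs, n=100_000, seed=0):
    """Estimate (text accesses) / (text length) on one iid text of length n."""
    rng = random.Random(seed)
    symbols = list(probs)
    weights = [probs[s] for s in symbols]
    text = "".join(rng.choices(symbols, weights=weights, k=n))
    _, accesses = horspool_accesses(pattern, text)
    return accesses / n


if __name__ == "__main__":
    # Hypothetical example: pattern "abab" under a uniform two-letter model.
    print(estimate_access_ratio("abab", {"a": 0.5, "b": 0.5}))
```

A single finite run only approximates the limit the abstract refers to. Averaging over several seeds or increasing `n` gives a steadier estimate, whereas the paper computes the asymptotic value exactly from the matching machine.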

Similar resources

Average-optimal string matching

The exact string matching problem is to find the occurrences of a pattern of length m from a text of length n symbols. We develop a novel and unorthodox filtering technique for this problem. Our method is based on transforming the problem into multiple matching of carefully chosen pattern subsequences. While this is seemingly more difficult than the original problem, we show that the idea leads...

Optimal Exact and Fast Approximate Two Dimensional Pattern Matching Allowing Rotations

We give fast filtering algorithms to search for a 2-dimensional pattern in a 2-dimensional text allowing any rotation of the pattern. We consider the cases of exact and approximate matching under several matching models, improving the previous results. For a text of size n × n characters and a pattern of size m × m characters, the exact matching takes average time O(n^2 log m / m^2), which is optimal. ...

A Comparative Study of Different Longest Common Subsequence Algorithms

The longest common subsequence is a classical problem which is solved by using the dynamic programming approach. The LCS problem has an optimal substructure: the problem can be broken down into smaller, simple "subproblems", which can be broken down into yet simpler subproblems, and so on, until, finally, the solution becomes trivial. The LCS problem also has overlapping subproblems: the soluti...

Approximate Geometric Pattern Matching Under Rigid Motions

We present techniques for matching point-sets in two and three dimensions under rigid-body transformations. We prove bounds on the worst-case performance of these algorithms to be within a small constant factor of optimal, and conduct experiments to show that the average performance of these matching algorithms is often better than that predicted by the worst-case bounds.

Optimally fast parallel algorithms for preprocessing and pattern matching in one and two dimensions

All algorithms below are optimal alphabet-independent parallel CRCW PRAM algorithms. In one dimension: Given a pattern string of length m for the string-matching problem, we design an algorithm that computes a deterministic sample of a sufficiently long substring in constant time. This problem used to be a bottleneck in the pattern preprocessing for one- and two-dimensional pattern matching. The ...

Journal title:
  • CoRR

Volume abs/1604.08437

Publication date: 2016